Temporal Correlations of Local Network Losses 
We introduce a continuum model describing data losses in a single node of a packet-switched 
network (like the Internet) which preserves the discrete nature of the data loss process. By con- 
struction, the model has critical behavior with a sharp transition from exponentially small to finite 
losses with increasing data arrival rate. We show that such a model exhibits strong fluctuations in 
the loss rate at the critical point and non-Markovian power-law correlations in time, in spite of the 
Markovian character of the data arrival process. The continuum model allows for rather general 
incoming data packet distributions and can be naturally generalized to consider the buffer server 
idleness statistics. 
I. INTRODUCTION 

Complex networks underpin many diverse areas of science. They manifest themselves in
relationships between network topology and the functional organization of complex neuron
structures, interacting organic molecules describing metabolic activity in living cells [3],
multi-species food webs [?], numerous aspects of social networks [?], and the connectivity
and operation of the Internet [10], [11]-[13]. New models of network topology such as
scale-free [1] or small-world [14] have been found to be surprisingly good at describing
real-world structures. The realization that complex networks describe universal properties
of many such problems has resulted in extensive research activity by the physics community
in the past decade (see Refs. [?] for reviews).

A problem of particular significance in many application domains is the resilience of
complex networks to the random or selective removal of nodes or links. For example, the
loss of connectivity in scale-free networks [?] has implications for the tolerance of the
Internet to protocol or equipment failures. Typically, site or bond disorder acts as an
input to such models, which makes them very general and applicable to a wide variety of
networks.

More recently there has been an increasing realization that network breakdowns can result
not only from the physical loss of connectivity, but also from the loss of data traffic in
the network (i.e. congestion) [21], [?]. However, only a few dynamical models of traffic in
networks have been considered to date [?]. In the case of communication networks, the
excessive loading of even a single node can give rise to cascades of failures arising from
traffic congestion and consequently isolate large parts of the network [23]. To describe
the operational failure arising from congestion at a particular network node, one needs to
account for the distinct features of the dynamically 'random' data traffic that causes such
a breakdown.

In this paper we model data losses in a single node of a packet-switched network like the
Internet. There are two distinct features which must be preserved in this case: the
discrete character of data propagation and the possibility of data overflow in a single
node. In a packet-switched network, data is divided into packets which are routed from
source to destination via a set of interconnected nodes (routers). At each node packets are
queued in a memory buffer before being serviced, i.e. forwarded to the next node (there are
separate buffers for incoming and outgoing packets, but we neglect this for the sake of
simplicity). Due to the finite capacity of memory buffers and the stochastic nature of data
traffic, any buffer can overflow, which results in packets being discarded.

We focus on a continuum description of the discrete process of data packet loss. Such a
continuum model represents a simplification that preserves the salient features of the data
loss mechanism, while at the same time it can be more easily embedded in a larger model
describing data packet losses in a large network. The continuum description allows us to
overcome inevitable difficulties in incorporating realistic distributions of incoming
traffic into a discrete-time class of models, like the one we introduced earlier [23]. In
contrast, the continuum model can easily incorporate a completely general distribution of
packet lengths and inter-arrival times, both essential in modeling data loss in finite-sized
buffers.

We introduce a model where noticeable data losses in a single memory buffer start when the
average rate of random packet arrivals approaches the service rate. Under this condition
the model has a built-in sharp transition from free flow to lossy behavior, with a sizeable
fraction of arriving packets being dropped. A sharp onset of network congestion is familiar
to everyone using the Internet and was numerically confirmed in different models [28]. Here
we stress that such congestion originating from a single node is characterized by strong
critical fluctuations of the data loss in the vicinity of the built-in transition.

In particular, we will show that a Markovian input process can give rise to long-range
temporal correlations of data losses that are strongly non-Markovian in the critical
regime. In the context of the Internet, this means that once excessive data losses start,
they are likely to persist for a while, thus impacting network operation. As we will
discuss later in this paper, this non-Markovian behavior has a profound effect on the
operation of current Internet protocols, such as the Transmission Control Protocol (TCP),
that dictate how users experience the network operation.

While data loss is natural and inevitable due to data overflow, we show that loss rate
statistics turn out to be highly nontrivial in the realistic case of a finite buffer, where
at the critical point the magnitude of fluctuations can exceed the average value. The
fluctuations still obey the central limit theorem, but only in the unrealistically long
time limit. The importance of fluctuations in some intermediate regime is a defining
feature of mesoscopic physics, although the reasons for this are entirely different (note
that even in the case of electrons, the origin of the mesoscopic phenomena can be either
quantum or purely classical, see, e.g., [29]).

The average loss rate and/or transport delays were previously studied, e.g., in the
theories of bulk queues [30], [31] or a jamming transition in traffic flow [33]. What makes
the present considerations intrinsically different from these theories is the very nature
of the quantity we consider: the losses (which do not exist in flow models) make the
description of the traffic process essentially non-Hermitian. Although fluctuations in
network dynamics were previously studied (see, e.g., [?]), this was done through
measurements or numerical simulations of data traffic.

Due to the symmetry of the continuum description of 
a buffer with respect to its full (lossy) and empty (idle) 
states, we also derive corresponding expressions for the 
statistics of idleness of the buffer server (i.e. output links 
from routers). This quantity is essential in determining 
the way the statistics of data traffic going into a subse- 
quent buffer along a data path are shaped. This is self- 
evidently important when we are attempting to describe 
the operation of an entire network. 



II. THE MODEL 

We consider a single finite-size memory buffer fed with a random data-packet stream. It
stores the packets and is then serviced by the data link that sends these packets further
along the network on a first-in-first-out basis. This adequately models the output buffer
attached to the switching device in the router. The speed of the input line of the buffer
is much greater than the speed of the output line. The reason is that the input comes from
the switching fabric of a router, which is designed to operate very fast in order to feed a
large number of such buffers, but sequentially. The capacity of the output line is normally
smaller.

Hence, we can model the packet arrival as an instantaneous renewal process. The storage
capacity of the buffer is $L$, measured in bits. The lengths of arriving packets are
treated as random, all being much smaller than $L$. The service rate (i.e. the rate at
which packets depart from the buffer) is considered to be deterministic, as randomness in
it is negligible compared to that of the input traffic. We normalize the lengths of packets
$p$, the speed of the output link $r_{\rm out}$ and the queue length $\ell$ by the size of
the buffer $L$ (which is henceforth set to 1).

The renewal cycle proceeds as follows: at the moment of arrival of a packet of size $p$,
the state of the queue is $\ell$; this is followed by the gap $\eta$ (the random
inter-arrival time) until the next arrival. We introduce the time scale required to empty a
full buffer provided there are no new arrivals, $\eta_0 = 1/r_{\rm out}$. If $\ell + p < 1$
then the packet joins the queue and the queue length prior to the next arrival is
$\ell' = \ell + p - \eta/\eta_0$ if $\ell' > 0$ and $\ell' = 0$ otherwise. If
$\ell + p > 1$ then the packet is discarded and the queue length prior to the next arrival
is $\ell' = \ell - \eta/\eta_0$ if $\ell' > 0$ and $\ell' = 0$ otherwise.
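The renewal cycle described above can be sketched as a short simulation. This is only an illustration under stated assumptions, not the authors' code: the exponential distributions for packet sizes and inter-arrival gaps, the default mean packet size of 0.01, and the name `simulate_buffer` are all choices made here for concreteness, whereas the model itself allows general distributions.

```python
import random


def simulate_buffer(n_packets, r_in, eta0=1.0, mean_packet=0.01, seed=0):
    """Simulate the renewal cycle for a buffer of unit size.

    r_in is the average incoming traffic rate (mean packet size / mean gap)
    and 1/eta0 is the service rate. Returns (offered, lost, idle): total
    offered traffic, total dropped traffic, and service capacity lost to
    idleness, all in units of the buffer size.
    """
    rng = random.Random(seed)
    mean_gap = mean_packet / r_in
    queue = 0.0
    offered = lost = idle = 0.0
    for _ in range(n_packets):
        p = rng.expovariate(1.0 / mean_packet)  # packet size (assumed exponential)
        eta = rng.expovariate(1.0 / mean_gap)   # inter-arrival gap (assumed exponential)
        offered += p
        if queue + p < 1.0:
            queue += p                          # packet joins the queue
        else:
            lost += p                           # buffer overflow: packet discarded
        drained = eta / eta0                    # service available during the gap
        if drained > queue:
            idle += drained - queue             # server runs dry before next arrival
            queue = 0.0
        else:
            queue -= drained
    return offered, lost, idle
```

With `r_in` well below the service rate `1/eta0` essentially nothing is dropped, while for `r_in` above it the lost fraction approaches `1 - 1/(r_in * eta0)`, which is the sharp transition the model is built around.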

Since the maximum packet size is much less than 1 (the buffer size), and assuming that the
average incoming traffic rate $r_{\rm in}$ (also normalized to the buffer size) is close to
the service rate,

$$ |r_{\rm in}\eta_0 - 1| \ll 1 , \tag{1} $$

we can treat $p$, $\eta$ and $\ell$ as continuous variables.

Our aim is to calculate the statistics of the amount of dropped traffic and of the service
lost due to idleness of the output link during time $t \gg \bar\eta$ ($\bar\eta$ is the
average inter-arrival time) in the regime (1). In this regime and for observation times
$t \gg \bar\eta$, the system can be described by the following Fokker-Planck equation (in
terms of the transitional probability density function $w(\ell', t; \ell)$):

$$ \partial_t w(\ell', t; \ell) = -a\,\partial_{\ell'} w(\ell', t; \ell) + \frac{1}{2}\sigma^2\,\partial^2_{\ell'} w(\ell', t; \ell) , \tag{2} $$

where $a$ and $\sigma^2$ are the average moments of the change of the queue size per unit
time,

$$ a = \frac{1}{\Delta t}\langle \Delta\ell \rangle , \qquad \sigma^2 = \frac{1}{\Delta t}\langle (\Delta\ell)^2 \rangle , \qquad \Delta t \to 0 , \tag{3} $$

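and the following boundary and initial conditions are imposed:

$$ J(\ell', t; \ell)\big|_{\ell'=0,1} = 0 , \tag{4} $$

$$ w(\ell', t; \ell)\big|_{t=0} = \delta(\ell' - \ell) , \tag{5} $$

where

$$ J(\ell', t; \ell) = a\,w(\ell', t; \ell) - \frac{1}{2}\sigma^2\,\partial_{\ell'} w(\ell', t; \ell) \tag{6} $$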

is the probability current. By $\Delta t \to 0$ in Eq. (3) we mean that $\Delta t$ is much
smaller than the observation time, but large enough so that the underlying stochastic
processes can be considered as continuous:

$$ \bar\eta \ll \Delta t \ll t . \tag{7} $$
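As a rough illustration of Eq. (3), the drift $a$ and the diffusion coefficient $\sigma^2$ can be estimated from samples of packet sizes and inter-arrival gaps. The sketch below is not part of the original derivation: it assumes the input can be treated as a renewal-reward process and uses the standard long-time variance rate for such processes, and the function name and arguments are hypothetical.

```python
def drift_and_diffusion(packets, gaps, eta0=1.0):
    """Estimate (a, sigma2) from paired samples of packet sizes and gaps.

    Assumption: each packet is paired with the gap that follows it, and
    the pairs are independent and identically distributed.
    """
    n = len(packets)
    mean_p = sum(packets) / n
    mean_gap = sum(gaps) / n
    r_in = mean_p / mean_gap                  # average incoming traffic rate
    a = r_in - 1.0 / eta0                     # drift: arrival rate minus service rate
    # Renewal-reward variance rate; the service is deterministic and adds no noise.
    sigma2 = sum((p - r_in * g) ** 2 for p, g in zip(packets, gaps)) / (n * mean_gap)
    return a, sigma2
```

Given such an estimate, the time scale `2.0 / sigma2`, which separates the short and long observation times considered in the text, can be evaluated directly.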
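The solution of (2), (4), (5) can be expressed as follows:

$$ w(\ell', t; \ell) = \frac{2v\,e^{2v\ell'}}{e^{2v} - 1} + 2\,e^{v(\ell' - \ell)} \sum_{k=1}^{\infty} \frac{\exp[-(\pi^2 k^2 + v^2)\tau]}{\pi^2 k^2 + v^2}\, [\pi k\cos(\pi k\ell') + v\sin(\pi k\ell')]\, [\pi k\cos(\pi k\ell) + v\sin(\pi k\ell)] , \tag{8} $$

where

$$ v = \frac{a}{\sigma^2} , \qquad \tau = \frac{\sigma^2 t}{2} . \tag{9} $$

Note that the solution (8) can be expressed in terms of $\theta$-functions.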

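For the Laplace transform of $w(\ell', t; \ell)$ (taken with respect to $\tau$, with
conjugate variable $\epsilon$) we have

$$ W(\ell', \epsilon; \ell) = \mathcal{L}_\tau\, w(\ell', t; \ell) = \frac{1}{2}\,\frac{e^{v(\ell' - \ell)}}{\kappa\sinh\kappa} \Big\{ \frac{2v^2}{\epsilon}\cosh[\kappa(\ell' + \ell - 1)] + \frac{2v\kappa}{\epsilon}\sinh[\kappa(\ell' + \ell - 1)] + \cosh[\kappa(|\ell' - \ell| - 1)] + \cosh[\kappa(\ell' + \ell - 1)] \Big\} , \tag{10} $$

where

$$ \kappa = \sqrt{\epsilon + v^2} . \tag{11} $$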



From (10) we have for the probabilities of returning to the boundaries

$$ W(0, \epsilon; 0) = \frac{1}{\epsilon}\,[\kappa\coth(\kappa) - v] , \qquad W(1, \epsilon; 1) = \frac{1}{\epsilon}\,[\kappa\coth(\kappa) + v] . \tag{12} $$

These will be used in the next section.
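The boundary return probabilities of Eq. (12) are easy to evaluate numerically, which helps to see their two limiting regimes. The sketch below assumes $\kappa = \sqrt{\epsilon + v^2}$, with $\epsilon$ conjugate to the rescaled time $\sigma^2 t/2$; the function names are hypothetical.

```python
import math


def kappa(eps, v):
    """kappa as assumed here: sqrt(eps + v^2)."""
    return math.sqrt(eps + v * v)


def return_prob_full(eps, v):
    """W(1, eps; 1): Laplace-domain return probability at the full boundary."""
    k = kappa(eps, v)
    return (k / math.tanh(k) + v) / eps


def return_prob_empty(eps, v):
    """W(0, eps; 0): Laplace-domain return probability at the empty boundary."""
    k = kappa(eps, v)
    return (k / math.tanh(k) - v) / eps
```

For large `eps` (short times) both functions approach `1 / sqrt(eps)`, the free-diffusion result near a reflecting wall. For small `eps` (long times) `eps * W` tends to `|v| coth|v| + v` at the full boundary and `|v| coth|v| - v` at the empty one, which are the boundary values of a stationary density proportional to `exp(2 * v * l)`.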



III. STATISTICS OF LOSSES 

In this section we concentrate on the statistics of the losses due to the buffer
overflowing. The corresponding formulae for the statistics of the server idleness can be
obtained using the transformation $\ell \to 1 - \ell$, $v \to -v$.

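First, we estimate the size of fluctuations of the losses on a time scale
$t \ll 2/\sigma^2$. In order to do that we consider the dynamics of the system near the
boundary $\ell = 1$, which is governed by the following transitional probability:

$$ w_0(\ell', t; \ell) = \frac{1}{\sqrt{2\pi\sigma^2 t}} \exp\left[ \frac{a(\ell' - \ell)}{\sigma^2} - \frac{a^2 t}{2\sigma^2} \right] \left\{ \exp\left[ -\frac{(\ell' - \ell)^2}{2\sigma^2 t} \right] + \exp\left[ -\frac{(2 - \ell' - \ell)^2}{2\sigma^2 t} \right] \right\} + \frac{a}{\sigma^2} \exp\left[ -\frac{2a(1 - \ell')}{\sigma^2} \right] \mathrm{erfc}\left[ \frac{2 - \ell' - \ell - at}{\sqrt{2\sigma^2 t}} \right] , \tag{13} $$

which is the solution of (2) when the boundary $\ell = 0$ is sent to $-\infty$. The change
in the state of the system during time $t$ can then be represented as follows: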

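$$ \Delta\ell(t) = \ell' - \ell = \Delta\ell_0(t) + \Delta\ell_{\rm loss}(\ell', t; \ell) , \tag{14} $$

where $\Delta\ell_0(t)$ is the change in the state of the system if there were no boundary;
its statistics are determined by

$$ \langle \Delta\ell_0(t) \rangle = at , \qquad \langle [\Delta\ell_0(t)]^2 \rangle = \sigma^2 t + o(t) , \tag{15} $$

and $\Delta\ell_{\rm loss}(\ell', t; \ell)$ is the amount of traffic lost due to buffer
overflowing. The moments of (14) can be defined as follows:

$$ \langle [\Delta\ell(t)]^n \rangle = \int d\ell'\, d\ell\, (\ell' - \ell)^n\, w_0(\ell', t; \ell)\, p(\ell) , \tag{16} $$

where $p(\ell)$ is the stationary distribution of buffer occupancy.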
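The half-line transition density used for the short-time analysis can be checked numerically. The sketch below uses the standard textbook density of Brownian motion with drift $a$ and variance rate $\sigma^2$, reflected at a barrier placed at $\ell = 1$; that this coincides term by term with Eq. (13) is an assumption, and the function names are hypothetical.

```python
import math


def w0(l_new, t, l_old, a, sigma2):
    """Density of reaching l_new at time t from l_old, reflecting wall at 1."""
    s2t = 2.0 * sigma2 * t
    drift_factor = math.exp(a * (l_new - l_old) / sigma2 - a * a * t / (2.0 * sigma2))
    direct = math.exp(-(l_new - l_old) ** 2 / s2t)        # free propagation
    image = math.exp(-(2.0 - l_new - l_old) ** 2 / s2t)   # mirror image about l = 1
    boundary = (a / sigma2) * math.exp(-2.0 * a * (1.0 - l_new) / sigma2) \
        * math.erfc((2.0 - l_new - l_old - a * t) / math.sqrt(s2t))
    return drift_factor * (direct + image) / math.sqrt(math.pi * s2t) + boundary


def total_probability(t, l_old, a, sigma2, n_steps=20000):
    """Trapezoidal integral of w0 over l_new on a wide window ending at 1."""
    lower = l_old - abs(a) * t - 12.0 * math.sqrt(sigma2 * t)
    h = (1.0 - lower) / n_steps
    total = 0.5 * (w0(lower, t, l_old, a, sigma2) + w0(1.0, t, l_old, a, sigma2))
    for i in range(1, n_steps):
        total += w0(lower + i * h, t, l_old, a, sigma2)
    return total * h
```

Because the wall only reflects, `total_probability` should return a value very close to 1 for any starting point below the wall. The traffic that would have crossed $\ell = 1$ is what the text counts as loss.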

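For the first two moments (16) in the limit $t \to 0$ we have

$$ \langle \Delta\ell(t) \rangle = at + \frac{\sigma^2 t}{2}\,p(1) , \qquad \langle [\Delta\ell(t)]^2 \rangle = \sigma^2 t . \tag{17} $$

From (14), (15), (17) we can conclude that

$$ \langle \Delta\ell_{\rm loss}(t) \rangle = \frac{\sigma^2 t}{2}\,p(1) , \qquad \langle [\Delta\ell_{\rm loss}(t)]^2 \rangle + 2\langle \Delta\ell_0(t)\,\Delta\ell_{\rm loss}(t) \rangle = o(t) . \tag{18} $$

The first of the relations (18) means that $\Delta\ell_{\rm loss}(\ell', t; \ell)$ is
non-zero only if $\ell \to 1$ in the limit $t \to 0$. The second relation means either

$$ \langle [\Delta\ell_{\rm loss}(t)]^2 \rangle ,\ \langle \Delta\ell_0(t)\,\Delta\ell_{\rm loss}(t) \rangle = o(t) \tag{19} $$

or

$$ \Delta\ell_{\rm loss}(t) = -2\Delta\ell_0(t) + o(\sqrt{t}) . \tag{20} $$

The relation (20) does not make sense physically, so in what follows we accept option (19)
and show that it is consistent with the later calculations.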

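Next we lift the restriction $t \ll 2/\sigma^2$. It can be shown that the conditional
moments (with the condition that the system was in the state $\ell$ at the beginning of the
observation interval) can be expressed as follows:

$$ m^{(k)}_{\rm loss}(t; \ell) = k!\, r_{\rm loss}^k \prod_{i=1}^{k} \int_0^{t_{i+1}} dt_i \prod_{i=2}^{k} w(1, t_i - t_{i-1}; 1) \times w(1, t_1; \ell) , \qquad t_{k+1} = t , \tag{21} $$

where $w(\ell', t; \ell)$ is determined by (8) and

$$ r_{\rm loss} = \lim_{t \to 0} \frac{1}{t} \int d\ell' \int d\ell\, \Delta\ell_{\rm loss}(\ell', t; \ell) = \lim_{t \to 0} \frac{1}{t} \int d\ell'\, d\ell\, (\ell' - \ell - at)\, w_0(\ell', t; \ell) = \frac{\sigma^2}{2} . \tag{22} $$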
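For unconditional moments in the stationary regime we have

$$ m^{(k)}_{\rm loss}(t) = k!\, p(1) \prod_{i=1}^{k} \int_0^{\tau_{i+1}} d\tau_i \prod_{i=1}^{k-1} w(1, \tau_{i+1} - \tau_i; 1) , \qquad \tau_{k+1} = \tau , \tag{23} $$

where $\tau$ is defined in (9) and $p(\ell)$ is the stationary solution of (2):

$$ p(\ell) = \frac{2v\,e^{2v\ell}}{e^{2v} - 1} . \tag{24} $$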



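To calculate $m^{(k)}_{\rm loss}(t)$ we consider its Laplace transform:

$$ M^{(k)}_{\rm loss}(\epsilon) = \mathcal{L}_\tau\, m^{(k)}_{\rm loss}(t) = \frac{k!\,p(1)}{\epsilon^2}\,[W(1, \epsilon; 1)]^{k-1} , \tag{25} $$

where $W(1, \epsilon; 1)$ is defined by (12).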

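Taking now the inverse Laplace transform we have

$$ m^{(k)}_{\rm loss}(t) = \mathcal{L}^{-1}_\tau M^{(k)}_{\rm loss}(\epsilon) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} d\epsilon\, e^{\epsilon\tau} M^{(k)}_{\rm loss}(\epsilon) . \tag{26} $$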



From this we obtain

$$ m^{(1)}_{\rm loss}(t) = p(1)\,\tau = p(1)\,\frac{\sigma^2 t}{2} . \tag{27} $$



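For the moments (25) with $k > 1$ we can identify the following regimes:

$$ M^{(k)}_{\rm loss}(\epsilon) \approx k!\,p(1) \begin{cases} \epsilon^{-(k+3)/2} , & \epsilon \gg 1 \\ [p(1)]^{k-1}\,\epsilon^{-(k+1)} , & \epsilon \ll 1 \end{cases} \tag{28} $$

Correspondingly, for the moments in the $t$-representation we have

$$ m^{(k)}_{\rm loss}(t) \approx \begin{cases} \dfrac{k!\,p(1)\,\tau^{(k+1)/2}}{\Gamma[(k+3)/2]} , & \tau \ll 1 \\ [p(1)\,\tau]^k , & \tau \gg 1 \end{cases} \tag{29} $$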
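To see how large the short-time fluctuations are, the moments can be evaluated numerically. The sketch below assumes that the short-time branch of Eq. (29) has the form $k!\,p(1)\,\tau^{(k+1)/2}/\Gamma[(k+3)/2]$, with $\tau = \sigma^2 t/2$; the function names are hypothetical.

```python
import math


def short_time_moment(k, tau, p1):
    """k-th moment of the lost traffic for tau << 1 (assumed form of Eq. (29))."""
    return math.factorial(k) * p1 * tau ** ((k + 1) / 2) / math.gamma((k + 3) / 2)


def relative_fluctuation(tau, p1):
    """Standard deviation of the loss divided by its mean, for tau << 1."""
    m1 = short_time_moment(1, tau, p1)
    m2 = short_time_moment(2, tau, p1)
    return math.sqrt(m2 - m1 * m1) / m1
```

For `k = 1` the expression reduces to `p1 * tau`, in line with Eq. (27). For `tau = 0.01` and `p1 = 1.0` the relative fluctuation is about 3.7, which illustrates the statement that the magnitude of fluctuations can exceed the average value.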



Now we calculate the PDF $p_{\rm loss}(x; t)$ of the amount of lost traffic, $x$, during
time $t$. To calculate it we consider its characteristic function in the
$\epsilon$-representation:

$$ P_{\rm loss}(s; \epsilon) = \mathcal{L}_x\, p_{\rm loss}(x; \epsilon) , \qquad p_{\rm loss}(x; \epsilon) = \mathcal{L}_\tau\, p_{\rm loss}(x; t) . \tag{30} $$



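From (30) we obtain

$$ P_{\rm loss}(s; \epsilon) = P_{\rm loss}(\epsilon) + \sum_{k=1}^{\infty} \frac{(-s)^k}{k!} \int_0^\infty dx\, x^k\, \mathcal{L}_\tau\, p_{\rm loss}(x; t) , \tag{31} $$

where

$$ P_{\rm loss}(\epsilon) = \mathcal{L}_\tau P_{\rm loss}(t) , \qquad P_{\rm loss}(t) = \int_0^\infty dx\, p_{\rm loss}(x, t) , \tag{32} $$

with $1 - P_{\rm loss}(t)$ being the probability for the system not to drop a single packet
over the period of time $t$. Substituting (25) into (31) we have

$$ P_{\rm loss}(s; \epsilon) = P_{\rm loss}(\epsilon) + \frac{p(1)}{\epsilon^2} \sum_{k=1}^{\infty} (-s)^k\, [W(1, \epsilon; 1)]^{k-1} = P_{\rm loss}(\epsilon) + \frac{p(1)}{\epsilon^2 W(1, \epsilon; 1)} \left[ \frac{1}{1 + sW(1, \epsilon; 1)} - 1 \right] . $$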



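In order for $P_{\rm loss}(s; \epsilon)$ not to exhibit abnormal behaviour (in particular,
for $p_{\rm loss}$ not to contain terms like $\delta(x)$), we must assume that

$$ P_{\rm loss}(\epsilon) = \frac{p(1)}{\epsilon^2 W(1, \epsilon; 1)} . \tag{33} $$

Hence,

$$ p_{\rm loss}(x; \epsilon) = \frac{p(1)}{\epsilon^2 W^2(1, \epsilon; 1)} \exp\left[ -\frac{x}{W(1, \epsilon; 1)} \right] . \tag{34} $$

Integrating this relation over $x$, we recover (33), which shows that our assumption is
indeed correct.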
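The consistency check mentioned above can be written out explicitly. The following is a sketch that assumes Eq. (34) has the exponential form $p(1)\,\epsilon^{-2} W^{-2} \exp(-x/W)$, with $W \equiv W(1, \epsilon; 1)$, and that the Laplace-domain moments are $k!\,p(1)\,W^{k-1}/\epsilon^2$:

```latex
\int_0^\infty dx\, p_{\rm loss}(x;\epsilon)
  = \frac{p(1)}{\epsilon^2 W^2} \int_0^\infty dx\, e^{-x/W}
  = \frac{p(1)}{\epsilon^2 W} = P_{\rm loss}(\epsilon) ,
\qquad
\int_0^\infty dx\, x^k\, p_{\rm loss}(x;\epsilon)
  = \frac{p(1)}{\epsilon^2 W^2}\, k!\, W^{k+1}
  = \frac{k!\, p(1)}{\epsilon^2}\, W^{k-1} .
```

The first integral reproduces the assumed $P_{\rm loss}(\epsilon)$ and the second reproduces the moments, so under these assumptions the exponential form is consistent with both.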
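In the regimes of short and long times we have

$$ p_{\rm loss}(x; t) \approx \begin{cases} p(1)\,\mathrm{erfc}\left[ \dfrac{x}{2\sqrt{\tau}} \right] , & \tau \ll 1 \\ \delta[x - \tau p(1)] , & \tau \gg 1 \end{cases} \tag{35} $$

and

$$ P_{\rm loss}(t) \approx \begin{cases} p(1)\sqrt{\dfrac{4\tau}{\pi}} , & \tau \ll 1 \\ 1 , & \tau \gg 1 \end{cases} \tag{36} $$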



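The conditional PDF (with the condition that the system dropped at least one packet during
the time $t$) can be defined as follows:

$$ \frac{p_{\rm loss}(x; t)}{P_{\rm loss}(t)} \approx \begin{cases} \sqrt{\dfrac{\pi}{4\tau}}\,\mathrm{erfc}\left[ \dfrac{x}{2\sqrt{\tau}} \right] , & \tau \ll 1 \\ \delta[x - \tau p(1)] , & \tau \gg 1 \end{cases} $$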
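The short-time expressions can be cross-checked numerically: integrating the loss density over $x$ should give the probability of dropping at least one packet. The sketch below assumes the short-time density is $p(1)\,\mathrm{erfc}[x/(2\sqrt{\tau})]$ and the corresponding drop probability is $p(1)\sqrt{4\tau/\pi}$; the function names are hypothetical.

```python
import math


def p_loss_short_time(x, tau, p1):
    """Assumed short-time density of the lost traffic x (tau << 1)."""
    return p1 * math.erfc(x / (2.0 * math.sqrt(tau)))


def drop_probability_numeric(tau, p1, n_steps=20000):
    """Trapezoidal integral of the density over x from 0 to 20*sqrt(tau)."""
    upper = 20.0 * math.sqrt(tau)
    h = upper / n_steps
    total = 0.5 * (p_loss_short_time(0.0, tau, p1) + p_loss_short_time(upper, tau, p1))
    for i in range(1, n_steps):
        total += p_loss_short_time(i * h, tau, p1)
    return total * h


def drop_probability_closed_form(tau, p1):
    """Assumed closed form p(1) * sqrt(4 * tau / pi) for tau << 1."""
    return p1 * math.sqrt(4.0 * tau / math.pi)
```

The two functions should agree closely for small `tau`. Dividing the density by the drop probability then gives a conditional density that depends on `x` and `tau` only through `x / sqrt(tau)`, apart from an overall `1 / sqrt(tau)` factor.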
The central moments can be calculated in the same way as (29). Here we will consider only
the variance of the losses $\sigma^2_{\rm loss}(t)$ in the limit $\tau \gg 1$:

$$ \sigma^2_{\rm loss}(t) = \frac{p(1)\,\tau}{|v|} \left[ \coth|v| - \frac{|v|}{\sinh^2|v|} \right] . \tag{37} $$



This is essentially in agreement with the result of Ref. [23], where a simple discrete-time
model for studying losses in a single buffer was introduced. In that model packets of fixed
size arrive with probability $p$ at equidistant time epochs. The service is deterministic,
and half a packet is served between successive time epochs. In spite of such
oversimplification, the discrete model delivers quantitatively the same results, which
indicates the universality of the approach.

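Finally, we calculate the correlator of the fluctuations of losses measured during two time
intervals of length $t_1$ and $t_2$, respectively, separated by the time $T$:

$$ \mathrm{corr}(t_1, t_2, T) \propto \rho(t_1, t_2, T) - m^{(1)}_{\rm loss}(t_1)\, m^{(1)}_{\rm loss}(t_2) , $$

where $\rho(t_1, t_2, T)$ is the joint moment of the losses in the two intervals, expressed
as an integral over $t_1$ and $t_2$ with $r_{\rm loss}$ defined in (22).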

In the regime $T \gg t_1, t_2$ and $T \gg 2/\sigma^2$ it can be shown that

$$ \mathrm{corr}(t_1, t_2, T) \to 0 , \tag{38} $$

as we would expect. In fact, the correlator goes to zero exponentially if $v \neq 0$. In
the opposite regime $2/\sigma^2 \gg T \gg t_1, t_2$ we have

$$ \mathrm{corr}(t_1, t_2, T) \propto \frac{1}{p(1)}\,\frac{1}{\sqrt{\pi\sigma^2 T}} , \tag{39} $$

which is again in agreement with the results of the discrete-time considerations [23].



IV. DISCUSSION AND CONCLUSION 

As we would expect intuitively, loss events separated widely in time are uncorrelated, as
shown by equation (38). By widely separated in time, we mean that the time separation of
the two observation intervals in which losses occur is much longer than the time over which
fluctuations of queue length become comparable to or much bigger than the buffer size
itself, i.e. $2/\sigma^2$.

However, when the separation time is much smaller than $2/\sigma^2$, the correlations of
loss fluctuations decay very slowly, as can be seen from equation (39). Such time intervals
are likely to be comparable to or even smaller than the round trip times for typical TCP
connections. TCP is the protocol that controls the rate at which data is sent across a
network between a particular source and destination. The exact details of the congestion
control operation of TCP can be found in [?]. For our purposes we shall focus only on its
salient congestion control features and the implications of the result of equation (39)
for it.



TCP limits its sending rate as a function of the perceived network congestion. It operates
on a virtual control loop of sending packets, receiving acknowledgements and estimating the
round trip time. Once a packet is lost, the sender cuts its transmission rate by half. If no
other loss is detected, it increases its sending rate linearly by a small increment. But if
a subsequent loss event is detected, it cuts its transmission rate in half again. If
successive loss events occur, which according to equation (39) is likely on the relevant
time scale, the reduction in transmission rate can be dramatic and potentially unnecessary.
As there are multiple TCP connections experiencing losses at the same buffer, this will
lead to a cycle of rapid under-usage and slow convergence to congestion, which is clearly
undesirable and ineffective.
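The rate adjustment described above (halve on loss, otherwise increase linearly) can be sketched in a few lines. This is a deliberately simplified illustration, not an implementation of TCP: the function name, the per-round-trip granularity and the parameter values are assumptions made here.

```python
def aimd_rate(loss_events, rate=1.0, increment=0.05, floor=0.01):
    """Return the sending rate after each round trip.

    loss_events is an iterable of booleans, one per round trip, where True
    means a loss was detected in that round trip. The rate is halved on a
    loss (but never drops below floor) and grows by increment otherwise.
    """
    history = []
    for lost in loss_events:
        if lost:
            rate = max(rate / 2.0, floor)   # multiplicative decrease
        else:
            rate += increment               # additive increase
        history.append(rate)
    return history
```

Three consecutive losses reduce the rate by a factor of eight, whereas recovery at a small fixed increment takes many round trips. This is why temporally correlated losses are much more damaging to throughput than the same number of isolated ones.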

The study of spatial correlations of loss fluctuations over a network is work in progress.
This will help us quantify the second significant aspect of TCP operation, which is its
reaction to time-out events, as this is connected to correlated losses and delays around
the sequence of buffers forming each control loop.

To conclude, we emphasize that the stability of a network with respect to data loss was
mostly analyzed in the past from the viewpoint of the loss of physical connectivity in the
network topology, where a failure of a given node or link was treated as a (probabilistic)
input into a network model. Here we have studied dynamical fluctuations in data loss in a
single node (memory buffer) of the network. We have shown that strong fluctuations and
long-time memory in losses inevitably follow from the discrete character of signal
propagation in packet-switched networks. These single-node fluctuations can potentially
trigger a cascade of failures in neighboring nodes and thus result in a temporary failure
of large parts of the network. In the next stage, we intend to utilize these features of
the local data loss as dynamical inputs into the network and thus study a possible abrupt
increase of data loss in the network triggered by a local overload.